Alignment-free Prediction of Ribonucleases using a Computational Chemistry approach: Comparison with HMM model and Isolation from Schizosaccharomyces pombe, Prediction, and Experimental assay of a new sequence
Authors
Abstract
The study of type III RNases constitutes an important area in molecular biology. The pac1 gene is known to encode a particular RNase III that shares low amino acid similarity with the products of other genes despite having double-stranded ribonuclease activity. Bioinformatics methods based on sequence alignment may fail when the amino acid identity between the query sequence and others with similar functions is low (remote homologues), or when no similar sequence is recorded in the database. Quantitative Structure-Activity Relationships (QSAR) applied to protein sequences may allow an alignment-independent prediction of protein function. These sequence QSAR-like methods often use 1D sequence numerical parameters as input to seek sequence-function relationships. However, first representing the sequences in 2D may uncover useful higher-order information. In the work described here we calculated, for the first time, the Spectral Moments of a Markov Matrix (MMM) associated with a 2D-HP-map of a protein sequence. We used MMM values to numerically characterize 81 sequences of type III RNases and 133 proteins of a control group. We subsequently developed one MMM-QSAR and one classic Hidden Markov Model (HMM) based on the same data. The MMM-QSAR discriminated RNases from other proteins with 97.35% accuracy without using alignment, a result as good as that of the known HMM techniques. We also report for the first time the isolation of a new Pac1 protein (DQ647826) from Schizosaccharomyces pombe, strain 428-4-1. The MMM-QSAR model predicts the new RNase III with the same accuracy as other classical alignment methods. Experimental assay of this protein confirms the predicted activity. The present results suggest that MMM-QSAR models may be used for protein function annotation without sequence alignment and with the same accuracy as classic HMM models.
Corresponding author: Agüero-Chapin, G.: Center of Chemical Bioactives and Faculty of Chemistry and Pharmacy, Central University of Las Villas, Santa Clara, 54830, Cuba. E-mail: [email protected]
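As a rough illustration of what "spectral moments of a Markov matrix" means, the sketch below row-normalizes a small weighted graph into a stochastic matrix M and computes the traces Tr(M^k). The abstract does not describe how the 2D-HP-map or its matrix is actually constructed, so the three-node graph, its weights, and the helper names (`row_normalize`, `matmul`, `spectral_moments`) are all assumptions made for illustration, not the authors' procedure.

```python
# Illustrative sketch only: the toy graph below is a hypothetical stand-in
# for a 2D-HP-map, whose real construction is not given in the abstract.

def row_normalize(weights):
    """Turn a non-negative square matrix into a row-stochastic (Markov) matrix."""
    matrix = []
    for row in weights:
        total = sum(row)
        if total == 0:
            raise ValueError("every row needs at least one positive entry")
        matrix.append([value / total for value in row])
    return matrix


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def spectral_moments(markov, max_order):
    """Return [Tr(M^1), ..., Tr(M^max_order)] for a square matrix M."""
    n = len(markov)
    # Start from the identity so the first product gives M itself.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order):
        power = matmul(power, markov)
        moments.append(sum(power[i][i] for i in range(n)))
    return moments


# Hypothetical 3-node map: each node carries a self-weight (diagonal),
# and adjacent nodes are linked with weight 1.
toy_weights = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
toy_markov = row_normalize(toy_weights)
toy_moments = spectral_moments(toy_markov, 3)
print(toy_moments)
```

In a QSAR-style workflow, a vector of such moments for each sequence would serve as the numerical input to a classifier; the choice of classifier and of the maximum order is likewise not specified in the abstract.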
Similar resources
A New Surface Tension Model for Prediction of Interaction Energy between Components and Activity Coefficients in Binary Systems
In this work, we develop a correlative model based on surface tension data in order to calculate thermodynamic parameters such as the interaction energy between components (Uij), activity coefficients, etc. In the new approach, by using the Li et al. (LWW) model, a three-parameter surface tension equation is derived for liquid mixtures. The surface tension data of 54 aqueous and 73 non-aqueous ...
Quantitative Modeling for Prediction of Critical Temperature of Refrigerant Compounds
The quantitative structure-property relationship (QSPR) method is used to develop the correlation between structures of refrigerants (198 compounds) and their critical temperature. Molecular descriptors calculated from structure alone were used to represent molecular structures. A subset of the calculated descriptors selected using a genetic algorithm (GA) was used in the QSPR model development...
gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
Prediction of accurate pKa values of some α-substituted carboxylic acids with low cost of computational methods
The acidity constants (pKa) of thirty-four (34) α-substituted carboxylic acids in aqueous solution have been calculated using the conductor-like polarizable continuum (C-PCM) solvation model. The gas-phase energies at the Density Functional Theory (DFT-MPW1PW91) level and solvation energies at the Hartree-Fock (HF) level are combined to estimate the pKa values, which are very close to the experimental values where, and ...
Flow Variables Prediction Using Experimental, Computational Fluid Dynamic and Artificial Neural Network Models in a Sharp Bend
The presence of a bend induces changes in the flow pattern, velocity profiles and water surface. In the present study, based on experimental data, a three-dimensional computational fluid dynamic (CFD) model is first simulated using Fluent, with two phases (water + air) for the free surface and the volume of fluid method, to predict the two significant variables (velocity and channel bed pressure) in 90º sharp...